Convergence Acceleration via Chebyshev Step: Plausible Interpretation of Deep-Unfolded Gradient Descent
Authors
Abstract
Deep unfolding is a promising deep-learning technique whose network architecture is based on expanding the recursive structure of an existing iterative algorithm. Although convergence acceleration is a remarkable advantage of deep unfolding, its theoretical aspects have not been revealed yet. The first half of this study details a theoretical analysis of the convergence acceleration in deep-unfolded gradient descent (DUGD), whose trainable parameters are step sizes. We propose a plausible interpretation of the learned step-size parameters in DUGD by introducing the principle of Chebyshev steps derived from Chebyshev polynomials. The use of Chebyshev steps in gradient descent (GD) enables us to bound the spectral radius of the matrix governing the convergence speed of GD, leading to a tight upper bound on the convergence rate. The convergence rate of GD using Chebyshev steps is shown to be asymptotically optimal, although it has no momentum terms. We also show that Chebyshev steps numerically explain the learned step-size parameters in DUGD well. In the second half of the study, we apply the theory of Chebyshev steps and propose Chebyshev-periodical successive over-relaxation (Chebyshev-PSOR) for accelerating linear/nonlinear fixed-point iterations. Theoretical analysis and numerical experiments indicate that Chebyshev-PSOR exhibits significantly faster convergence for various examples such as the Jacobi method and proximal gradient methods.
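The Chebyshev-step idea summarized above can be illustrated with a small numerical sketch (not the authors' code; the quadratic objective, the spectrum bounds, and the period length `T` are assumptions made for illustration): for gradient descent on a quadratic whose Hessian eigenvalues lie in [lmin, lmax], taking the step sizes as reciprocals of the Chebyshev nodes on that interval accelerates convergence well beyond the best constant step size.

```python
import numpy as np

def chebyshev_steps(lmin, lmax, T):
    # Reciprocals of the T Chebyshev nodes on [lmin, lmax],
    # used as a periodic step-size schedule of length T.
    k = np.arange(T)
    nodes = (lmax + lmin) / 2 + (lmax - lmin) / 2 * np.cos((2 * k + 1) * np.pi / (2 * T))
    return 1.0 / nodes

def run_gd(A, b, steps, iters):
    # Plain gradient descent on f(x) = 0.5 x^T A x - b^T x with a cyclic
    # step-size schedule; returns the final distance to the minimizer.
    x = np.zeros(A.shape[0])
    x_star = np.linalg.solve(A, b)
    for t in range(iters):
        x = x - steps[t % len(steps)] * (A @ x - b)
    return np.linalg.norm(x - x_star)

rng = np.random.default_rng(0)
n, lmin, lmax = 20, 1.0, 10.0
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(lmin, lmax, n)) @ Q.T   # spectrum in [lmin, lmax]
b = rng.standard_normal(n)

T = 8
err_cheb = run_gd(A, b, chebyshev_steps(lmin, lmax, T), 4 * T)
err_const = run_gd(A, b, np.array([2.0 / (lmin + lmax)]), 4 * T)
print(err_cheb, err_const)
```

Because the iteration matrices I − α_k A are polynomials in A and hence commute, the error after each full period of T steps equals a scaled Chebyshev polynomial evaluated at A; this is the mechanism behind the spectral-radius bound described in the abstract.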
Similar Resources
Convergence diagnostics for stochastic gradient descent with constant step size
Iterative procedures in stochastic optimization are typically comprised of a transient phase and a stationary phase. During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in a convergence region, commonly around a single point. In this paper, we develop a statistical diagnostic test to detect such phase transiti...
Modified frame algorithm and its convergence acceleration by Chebyshev method
The aim of this paper is to improve the convergence rate of the frame algorithm based on the Richardson iteration and Chebyshev methods. Based on the Richardson iteration method, we first square the existing convergence rate of the frame algorithm, which in turn halves the number of iterations and increases the speed of convergence. Afterward, by using Chebyshev polynomials, we improve this s...
Stochastic Proximal Gradient Descent with Acceleration Techniques
Proximal gradient descent (PGD) and stochastic proximal gradient descent (SPGD) are popular methods for solving regularized risk minimization problems in machine learning and statistics. In this paper, we propose and analyze an accelerated variant of these methods in the mini-batch setting. This method incorporates two acceleration techniques: one is Nesterov’s acceleration method, and the othe...
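As a minimal illustration of the proximal gradient method referred to above (a hypothetical example, not this paper's mini-batch or Nesterov-accelerated variant), the sketch below runs plain ISTA on a lasso problem, where the proximal operator of the ℓ1 regularizer is soft-thresholding; the problem data are assumptions made for the demo.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, iters=200):
    # Proximal gradient descent (ISTA) for 0.5*||Ax - b||^2 + lam*||x||_1.
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)       # gradient of the smooth term
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0                       # sparse ground truth
b = A @ x_true
x_hat = ista(A, b, lam=0.1)
print(np.linalg.norm(A @ x_hat - b))   # residual shrinks from ||b||
```

Each iteration takes one gradient step on the smooth least-squares term and then applies the prox of the nonsmooth ℓ1 term; since ISTA with step 1/L monotonically decreases the objective, the final residual is guaranteed to be no larger than the initial one.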
Convergence Analysis of Gradient Descent Stochastic Algorithms
This paper proves convergence of a sample-path based stochastic gradient-descent algorithm for optimizing expected-value performance measures in discrete event systems. The algorithm uses increasing precision at successive iterations, and it moves against the direction of a generalized gradient of the computed sample performance function. Two convergence results are established: one, for the ca...
On the Convergence of Decentralized Gradient Descent
Consider the consensus problem of minimizing f(x) = ∑_{i=1}^{n} f_i(x) where each f_i is only known to one individual agent i belonging to a connected network of n agents. All the agents shall collaboratively solve this problem and obtain the solution via data exchanges only between neighboring agents. Such algorithms avoid the need of a fusion center, offer better network load balance, and improve da...
Journal
Journal title: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Year: 2022
ISSN: 1745-1337, 0916-8508
DOI: https://doi.org/10.1587/transfun.2021eap1139